ABSTRACT 

Chromatin organization and composition impart 
sophisticated regulatory features critical to eukary- 
otic genomic function. Although DNA sequence- 
dependent histone octamer binding is important 
for nucleosome activity, many aspects of this phe- 
nomenon have remained elusive. We studied nu- 
cleosome structure and stability with diverse DNA 
sequences, including Widom 601 derivatives with 
the highest known octamer affinities, to establish a 
simple model behind the mechanics of sequence 
dependency. This uncovers the unique but unex- 
pected role of TA dinucleotides and a propensity 
for G|C-rich sequence elements to conform ener- 
getically favourably at most locations around the 
histone octamer, which rationalizes G|C% as the 
most predictive factor for nucleosome occupancy 
in vivo. In addition, our findings reveal dominant 
constraints on double helix conformation by H3-H4 
relative to H2A-H2B binding and DNA sequence 
context-dependency underlying nucleosome struc- 
ture, positioning and stability. This provides a basis 
for improved prediction of nucleosomal properties 
and the design of tailored DNA constructs for chro- 
matin investigations. 

INTRODUCTION 

Nucleosome positioning and occupancy underlie regula- 
tion of the genome (1,2), yet in contrast to sequence- 
specific DNA binding proteins, core histones have 
evolved to largely minimize sequence dependency in the 
nucleosome. Nonetheless, the tight wrapping around 
the histone octamer gives rise to an indirect readout of 
the DNA, whereby sequence-dependent attributes 
contribute to the free energy of nucleosome formation in 
the absence of base-specific protein contacts (3-6). 
Together with other factors like chromatin remodelling 
activities, the sequence dependency influences nucleosome 
dynamics and positioning, which contribute to genomic 
regulation by inhibiting or facilitating access to target 
DNA and protein sites. Although many, sometimes con- 
flicting, models have been proposed in the past, the 
general attribute of %G|C content has recently emerged 
as the most decisive factor for predicting nucleosome oc- 
cupancy in vivo (7,8), and yet the basis for this relationship 
is not obvious. 

Structural and thermodynamic studies have shown that 
the sequence dependency of histone octamer binding 
arises from an interplay of sequence-specific DNA con- 
formation and flexibility with the orientation of the 
major and minor grooves relative to the histone binding 
sites (3,5,6,9,10). Although the cumulative effect of DNA 
sequence alterations on binding affinity over the entire 
histone octamer can be substantial, the contribution of 
individual nucleotide changes is generally too small to be 
measured experimentally. Moreover, unique behavioural 
features of the double helix in the nucleosomal state 
(5,11,12) and the lack of detailed structural information 
over diverse DNA sequences have precluded elucidating 
the rules governing positioning and stability. Here, we 
present a nucleosome crystal- and solution-state structure 
analysis with stability measurements over nine different 
positioning DNA fragments, which has revealed a set of 
unifying mechanical principles behind histone octamer 
affinity for the double helix. 

MATERIALS AND METHODS 

Nucleosome production, structure solution and analysis 

Design of expression constructs for DNA fragment 
production and generation of NCP from recombinant 
Xenopus laevis histones were carried out as described 
previously (10,12,13). NCP-601L, NCP-TA2 and NCP146b 
crystallization, crystal stabilization, diffraction analysis 
and structural solution were conducted as reported 
before (10,12,14). Single crystal X-ray diffraction data 
were recorded at the Swiss Light Source (Paul Scherrer 
Institute, Villigen, Switzerland) using the PILATUS 
detector on beam line X06SA and a Mar225 CCD 
detector on beam line X06DA. 

Table 1. Data collection and refinement statistics 

a X-ray wavelength used for data collection: 0.80 Å, NCP-601L; 0.9 Å, NCP-TA2; 1.07 Å, NCP146b. Values in parentheses are for the 
highest-resolution shell. 
b For NCP-TA2, additional reflections to 2.07 Å were used in refinement and for calculation of electron density maps 
(Rmerge = 45.2% and completeness = 15.4% for the 2.07-2.18 Å shell). 

X-ray data were processed with MOSFLM (15) and 
SCALA from the CCP4 suite (16). Structural refinement 
and model building were carried out with routines from 
the CCP4 suite. NCP-601L, NCP-TA2 and NCP146b 
structures were solved by initial rigid body refinement 
using earlier NCP models (Table 1) (4,12,14). The 
NCP147, NCP146, NCP145, NCP-TA and NCP-601 
structures that were used in analyses and comparisons 
were those previously reported (4,10,12,14). 

For structural comparisons, NCP models were 
superimposed by least-squares fitting of the main chain 
atoms from the histone-fold domains (H2A, residues 
23-91; H2B, residues 33-99; H3, residues 60-132; H4, 
residues 27-94). The identification of dinucleotide steps 
as major groove-inward, minor groove-inward and 
pressure points was established by the cosine of 
accumulated twist (CAT) values that indicate groove 
orientation relative to the histone binding surface (5). 
The accumulated twist values are calculated by summa- 
tion, relative to the central base pair at the nucleosome 2- 
fold axis, of the local dinucleotide twist values associated 
with the step midpoints. 
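
The accumulated-twist calculation described above can be sketched in a few lines. This is only an illustration, not the authors' script: the function names and the uniform 36° twist values are hypothetical, the angle at each step midpoint is assumed to be the twist summed from the central base pair plus half of the step's own twist, and the sign convention (positive cosine read as major groove-inward) is an assumption based on the major groove facing the octamer at the 2-fold axis.

```python
import math

def cosine_accumulated_twist(twists_deg):
    """Cosine of accumulated twist at each step midpoint, moving outward.

    twists_deg[i] is the local twist (degrees) of the i-th dinucleotide
    step counted outward from the central base pair (assumed layout).
    """
    cat = []
    accumulated = 0.0  # twist summed from the central base pair
    for twist in twists_deg:
        midpoint = accumulated + twist / 2.0  # angle at the step midpoint
        cat.append(math.cos(math.radians(midpoint)))
        accumulated += twist
    return cat

def groove_facing(cat_value):
    """Assumed convention: positive = major groove-inward, else minor."""
    return "major" if cat_value > 0 else "minor"

# Hypothetical input: ideal 36 degrees per step (10 bp per helical turn).
example = cosine_accumulated_twist([36.0] * 10)
labels = [groove_facing(value) for value in example]
```

With real data, the twist values would come from a step-parameter analysis of the crystal structure, and the most negative values under this convention would mark the steps where the minor groove faces the octamer most directly.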

The phosphate groups comprising the binding platforms 
correspond to those from the nucleotide pairs on 
the inside of the superhelix that span the major-to-minor 
groove-inward transitions (black and orange phosphate 
pairs in Figure 1). The nucleotide numbers of the 
binding platform phosphate groups can be derived from 
the histone-DNA register assignments in Figure 1B: 
NCP145, chains I and J, -54, -53, -43, -42, -33, 
-32, -23, -22, -13, -12, -3, -2, 7, 8, 17, 18, 27, 28, 
37, 38, 47, 48, 58, 59; NCP-601L, chains I and J, -54, 
-53, -44, -43, -34, -33, -24, -23, -13, -12, -3, -2, 
7, 8, 17, 18, 28, 29, 38, 39, 48, 49, 58, 59; NCP146b, chain 
I, -54, -53, -44, -43, -34, -33, -24, -23, -13, -12, 
-3, -2, 7, 8, 17, 18, 28, 29, 38, 39, 48, 49, 59, 60, chain J, 
-55, -54, -44, -43, -34, -33, -24, -23, -13, -12, -3, 
-2, 7, 8, 17, 18, 28, 29, 38, 39, 48, 49, 58, 59; NCP147, 
chains I and J, -55, -54, -44, -43, -34, -33, -24, -23, 
-13, -12, -3, -2, 7, 8, 17, 18, 28, 29, 38, 39, 48, 49, 59, 
60. For the analysis of variability in double helix binding 
between NCP constructs (Table 2), the positional deviation 
comparison sets comprised the 24 phosphorus atoms of 
the binding platforms from each H3-H4 tetramer and 
H2A-H2B dimer pair. 
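
As a rough illustration of the positional deviation comparison, the sketch below computes the mean and standard deviation of the distances between paired phosphorus atoms from two already superimposed structures. The function name and the toy coordinates are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
import math
from statistics import mean, stdev

def platform_deviation(coords_a, coords_b):
    """Mean and sample standard deviation of paired atom distances (in Å).

    coords_a and coords_b are equal-length lists of (x, y, z) tuples for
    the same binding platform atoms in two superimposed NCP models.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must pair one-to-one")
    distances = [math.dist(a, b) for a, b in zip(coords_a, coords_b)]
    return mean(distances), stdev(distances)

# Toy coordinates (hypothetical); a real comparison set holds 24 atoms.
set_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
set_b = [(0.0, 0.0, 0.3), (1.0, 0.4, 0.0), (0.5, 1.0, 0.0)]
average, spread = platform_deviation(set_a, set_b)
```

The superposition itself (least-squares fitting on the histone-fold main chain atoms) is assumed to have been done beforehand and is not part of this sketch.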

Dinucleotide step conformational parameters and re- 
constructions were obtained using 3DNA (18,19). 
Double helix axes and helical curvature values were 
calculated with Curves (20). Graphic figures were 
prepared with PyMOL (DeLano Scientific LLC, San 
Carlos, CA, USA). 

Tyrosine fluorescence spectroscopy 

Nucleosome stability was measured by monitoring 
the increase in fluorescence, which results from a loss in 
quenching of histone tyrosine residues by proximal DNA 
bases as DNA-histone interactions are disrupted (21-23). 
Figure 1. Double helix association, conformation and positioning on the histone octamer. (A and B) Minor and major groove-inward-facing regions 
are orange and black, respectively, with 'pressure points' at minor groove-inward centres highlighted gold. Histone proteins are blue, H3; green, H4; 
yellow, H2A; and red, H2B (DNA-binding motifs: L, loop; A, α-helix). (A) Section of the NCP-601L crystal structure with phosphorus atoms of the 
'binding platforms' shown as spheres. Bound single-strand regions act as a 'hinge', allowing conformational variation between different DNA 
sequences. (B) NCP constructs are arranged in order of increasing salt stability. Severe kinks at locations of DNA stretching around SHL ±2 
or ±5 (magenta underlines), associated with a single base pair shift in histone-nucleotide register, are depicted as gaps in the sequence. 
DNA-permanganate reactivity hotspots in the nucleosomal state from footprinting analysis (six constructs) are indicated with green asterisks. 
Sites where the nucleosomal DNA shows reduced permanganate reactivity relative to the naked state are indicated with blue arrowheads. 
Capitalized bases in the Widom consensus sequence represent the most highly conserved nucleotides (17). The histone-DNA register assignments 
for NCP-601R and the Widom consensus sequence, for which crystal structures are not available, were inferred from the structures of NCP-601 and 
NCP-601L. 



Fluorescence measurements were conducted with a Varian 
Cary Eclipse fluorescence spectrophotometer (Agilent 
Technologies, Santa Clara, USA) equipped with a temperature 
control unit. Samples of 0.5 µM NCP were 
allowed to equilibrate at 25°C in a buffer of 20 mM Tris 
(pH 7.5), 1 mM EDTA, 1 mM dithiothreitol and NaCl 
ranging in concentration from 0 M to 2.6 M. 
Fluorescence readings were taken at 20°C with samples 
in Teflon stopper cuvettes having 1-cm path length. 
Tyrosine fluorescence was measured by the emission at 
306 nm through excitation at 275 nm. 

Fluorescence measurements were carried out in tripli- 
cate and values for each data set were placed on a relative 
scale by setting the minimum and maximum values over 
the 0-2.6 M range to 0 and 1, respectively. The three 
normalized data sets for each construct type were then 
merged into one by averaging. Optimized sigmoidal 
profiles were fit to the merged data sets using OriginPro 
8.1 (OriginLab Corp., Northampton, MA, USA), from 
which 50% dissociation points with respect to NaCl con- 
centration were derived. 
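
The processing steps above (min-max normalization of each replicate, averaging, sigmoidal fit, 50% point) can be sketched as follows. This is not the OriginPro procedure: a two-parameter logistic curve fitted by a coarse grid search is swapped in purely for illustration, and all function names, the synthetic replicates and the grid step are hypothetical.

```python
import math

def normalize(values):
    """Rescale one titration so its minimum is 0 and its maximum is 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def merge_replicates(replicates):
    """Normalize each replicate, then average point by point."""
    normalized = [normalize(r) for r in replicates]
    return [sum(point) / len(point) for point in zip(*normalized)]

def logistic(x, midpoint, width):
    return 1.0 / (1.0 + math.exp(-(x - midpoint) / width))

def fit_midpoint(salt, signal, step=0.01):
    """Grid search for the (midpoint, width) pair with least squared error."""
    best = None
    n_mid = int(round((max(salt) - min(salt)) / step))
    for i in range(n_mid + 1):
        midpoint = min(salt) + i * step
        for j in range(1, 101):
            width = j * step
            err = sum((logistic(x, midpoint, width) - y) ** 2
                      for x, y in zip(salt, signal))
            if best is None or err < best[0]:
                best = (err, midpoint, width)
    return best[1], best[2]

# Synthetic triplicate (hypothetical): same curve, different raw scales.
salt = [i * 0.2 for i in range(14)]  # 0 to 2.6 M NaCl
replicates = [[a * logistic(x, 1.5, 0.1) + b for x in salt]
              for a, b in [(100.0, 5.0), (120.0, 7.0), (90.0, 3.0)]]
merged = merge_replicates(replicates)
midpoint, width = fit_midpoint(salt, merged)
```

Under this parameterization the fitted `midpoint` is the NaCl concentration at which the normalized signal reaches 0.5, which corresponds to the 50% dissociation point; a proper nonlinear least-squares routine would replace the grid search in practice.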

DNA distortion analysis by permanganate footprinting 

Samples of 2.5 µM DNA or NCP in a buffer of 20 mM 
K-cacodylate (pH 6.0) were incubated with 1 mM KMnO4 
in the dark for 60 min at room temperature. A 10% molar 
excess of dTTP to KMnO4 was added to the samples 
followed by incubation for 15 min to quench the reactions. 
Samples were subsequently treated with the addition of 
NaCl solution to a final concentration of 2 M, followed 
by phenol-chloroform extraction and ethanol precipitation 
of the DNA, which was then 5' end-labelled with 
polynucleotide kinase. Thermally induced strand 
cleavage at KMnO4-modified bases was effected by 
30 min incubation of 10 µl samples at 99°C, followed by 
a further 30 min incubation after addition of 100 µl of 10% 
(v/v) piperidine with 50 mM EDTA (pH 8.0). Maxam-Gilbert 
purine sequencing standards (24) were prepared 
as markers. DNA fragments were resolved by denaturing 
PAGE (10% polyacrylamide, 8 M urea, 88 mM Tris-borate, 
2 mM EDTA, pH 8.3), in which loaded quantities 
of DNA samples corresponded to approximately equal 
total radioactive counts. 

Table 2. Variability in double helix binding by the H3-H4 tetramer 
and H2A-H2B dimers 

[Table body incomplete; entries in source order: Histone Sites; NCP-601L; H3-H4; 0.48 ± 0.23; H2A-H2B; 1.16 ± 0.62; 0.48 ± 0.25; 0.52 ± 0.27] 

Values (Å) are average differences in phosphate group positioning 
within the binding platforms between two NCP structures. The global 
average associated with all comparisons is 0.47 ± 0.25 Å for H3-H4 
sites and 0.92 ± 0.50 Å for H2A-H2B sites. 

Table 3. Comparison of permanganate reactivity between the highest 
and lowest stability NCP constructs 

[Table body incomplete; entries in source order: 22.6 (3.3); 15.7 (5.1)] 

Values correspond to reactivity hotspot intensities in the nucleosomal 
state with corresponding values for the naked state in parentheses. 
Note that two hotspot locations are at major groove-inward positions 
for NCP146b. 
See Figure 1B for overview. 

The Quantity One program (Bio-Rad Laboratories) was 
used for intensity quantification of gel bands (Table 3). 
The reactivity hotspot intensities for the nucleosomal 
state (Figure 7B, lanes 3 and 9), and the corresponding 
locations in the naked DNA state (lanes 2 and 8), were 
placed on a relative scale by normalizing with respect to 
the band intensity associated with nucleotide 5 
(SHL ±0.5) of NCP-601L (lane 3). For SHL sites contain- 
ing more than one reactive nucleotide, normalized 
intensities were summed to yield the values given in 
Table 3. 
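
The normalization and summation just described amounts to dividing each band intensity by the reference band and adding up the bands that belong to one SHL site. A minimal sketch follows; the function name, site labels and raw intensities are hypothetical and are not values from the gels.

```python
def normalize_hotspots(band_intensities, reference_intensity):
    """Sum reference-normalized band intensities for each SHL site.

    band_intensities maps a site label to the raw intensities of all
    reactive nucleotides at that site; reference_intensity is the raw
    intensity of the chosen reference band.
    """
    if reference_intensity <= 0:
        raise ValueError("reference intensity must be positive")
    return {
        site: sum(value / reference_intensity for value in bands)
        for site, bands in band_intensities.items()
    }

# Hypothetical raw intensities: one site with two reactive nucleotides.
raw = {"SHL 1.5": [200.0, 100.0], "SHL 2.5": [450.0]}
normalized = normalize_hotspots(raw, reference_intensity=50.0)
```

The same reference band would be used for the nucleosomal and naked DNA lanes so that the two states stay on a common relative scale.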



RESULTS 

Nucleosome structure with the highest affinity DNA 
sequences 

Most nucleosome core particle (NCP) crystal structures 
have been based on a single type of human α-satellite 
sequence (25), but the discovery of DNA fragments with 
the highest known affinity for the histone octamer, by 
selection from a pool of synthetic random sequences (17,26), 
affords an opportunity to decipher the precise mechanics 
of DNA wrapping. Structures with the most widely used 
construct derived by this approach, the Widom 601 strong 
positioning sequence, have recently been solved and 
analysed (12,27,28), but they do not provide an unambiguous 
picture of the DNA conformation due to the 
non-palindromic nature of the fragment. To overcome 
the problem of orientational mixing of non-identical 
sequences between the two pseudo-symmetry-related NCP 
halves in the crystal, we generated palindromic derivatives 
of the left and right halves of the 601 sequence, 601L and 
601R. We obtained a 2.2 Å resolution crystal structure of 
NCP assembled with 601L (NCP-601L; Table 1 and 
Supplementary Figure S1), although crystals of 
NCP-601R diffracted only very poorly. In combination 
with new high-resolution structures for NCP146b and 
NCP-TA2, which are composed of a distinct α-satellite 
sequence (4) and derivative, respectively, the histone-DNA 
register established for the three 601 and six centromeric 
constructs permits unprecedented insight into 
sequence versus position relationships in the nucleosome 
(Figure 1). 

Alignment of the highest affinity sequences derived by 
SELEX allowed Widom and colleagues to predict a consensus 
sequence for maximum affinity histone octamer 
binding over the central 73 bp of the nucleosome (this 
approach apparently did not allow full optimization of 
the outer H2A-H2B binding sites; Figure 1B) (17). Our 
601L is very similar to this consensus sequence, and in 
fact, NCP-601L displays extraordinary salt stability, 
which is substantially above that of the parent 601 
particle (NCP-601; Figure 2 and Supplementary Figure 
S2). Conversely, the stability of NCP-601R is significantly 
reduced relative to NCP-601, although it is still above that 
of the α-satellite-based NCPs, which all form a cluster 
with similar stabilities. 

The general attributes that appear to endow the 601L 
(like the consensus) sequence with exceptional histone 
octamer affinity have been previously outlined (12). 
These include the presence of the most flexible dinucleotide 
type, TA (29), at minor groove-inward positions 
where DNA distortion is energetically most challenging 
(30,31) and G|C-rich elements, which are predisposed to 
major groove bending/compression, at major 
groove-inward positions. Moreover, 601L and the other 
601 sequences contain TTTAA elements at minor 
groove-inward locations situated 1.5 double helix turns 
from the nucleosome centre (SHL ±1.5). This coincides 
with the most stringent singular positioning signal in the 
nucleosome (9), whereby extreme narrowing of the minor 
groove is required (10), and α-satellite derivatives engineered 
with TTTAA elements at SHL ±1.5 (NCP-TA and 
NCP-TA2) also display increased stability over their 
parent constructs (Figure 2 and Supplementary Figure 
S2). Nonetheless, the reason why TA steps in the highest 
affinity sequences occupy such uniquely defined positions 
within the different minor groove-inward locations 
(Figure 1B) is not clear. Moreover, as apparent from the 
crystal structure of NCP-601L, the TA elements do not 
take up the strongly kinked conformations expected based 
on the behaviour of highly flexible CA=TG dinucleotides 
in the earlier α-satellite structures (Figures 3 
and 4). Additionally, the high stability of NCP-601R in 
spite of having GGG(G) elements over six of the eight 
minor groove-inward locations around the nucleosome 
centre (Figure 1B) blatantly conflicts with the general 
preference of G|C sequences to take up the opposite 
orientation of major groove-inward (3,6,17). 

Mechanical model for DNA sequence dependency 

Previous models for the sequence-dependent mechanics of 
wrapping in the nucleosome have been based on the con- 
tribution of specific DNA conformational parameters, 
foremost base pair step roll and slide (Figure 3A for an 
illustration of the six dinucleotide step parameters), 
towards generation of the superhelix (5,32). However, 
this is only an indirect description that does not address 
the underlying localized constraints on double helix struc- 
ture, which moreover arise from a form of protein associ- 
ation that is unique to the nucleosome. Histone binding 
involves mainly interaction with the inward-facing 
phosphodiester backbone. In particular, for every turn 
of the double helix there are four phosphate groups that 
are in closest proximity to the histone octamer surface, 
encompassing the most extensive protein-DNA inter- 
actions. The phosphorus atoms of these four phosphate 
groups, designated as the 'binding platform', lie roughly in 
a plane and show the least variation in position between 
different DNA sequences (Figures 1A and 5). Thus, the 
binding platform represents the most constraining feature 
of histone association, which requires a very narrow minor 
groove for both DNA strands to fit on the protein surface. 
Correspondingly, the major groove must be narrow where 
it faces inward in order for both strands to fit onto 
adjacent binding platforms. Furthermore, the two sides 
of the binding platforms each act like a 'hinge', which 
allows freedom in the positioning of the opposing, 
unbound DNA strand (Figures 1A, 5 and 6). This attri- 
bute permits substantial conformational variation 
between different DNA sequences, limiting the sequence 
dependency of the nucleosomal system. 

In comparing double helix conformation between the 
three NCP crystal structures having the least DNA 
sequence similarity (NCP147, NCP146b and NCP-601L), 
which also display the same histone-nucleotide register 
over the central ~100 bp due to the absence of DNA 
stretching (Figures 1B and 6 for illustrations of stretching) 
around the SHL ±2 locations, it is evident that histone 
association both imposes substantial constraints and 
allows a significant degree of freedom (Figures 3 and 5). 
The conformational freedom relates largely to swivel-like 
alterations about the hinges while upholding the constraints 
of the binding platforms, and the most striking 
example of this corresponds to distinctions between 
NCP145 and NCP147, which, in spite of near DNA 
sequence identity, differ dramatically in double helix 
structure over the SHL ±1 to SHL ±2 regions (Figure 
6). Whereas NCP147 displays smooth bending over 
these regions, DNA stretching occurring in NCP145 is 
accompanied by extreme kinking into the major or the 
minor groove. These dramatic conformational differences 
give rise to a distinct distribution of double helix curvature 
between hinges, but coincide nonetheless with nearly identical 
phosphate group positioning within the H3-H4 
binding platforms. On the other hand, sequence-specific 
differences in histone binding over the other region of potential 
DNA stretching, SHL ±4.5 to SHL ±5.5, have 
been reported previously (4,27), and a substantial variability 
in H2A-H2B association over this location is evident 
from comparison of the most diverse NCP constructs 
(Figures 3 and 5). 
The H3-H4 tetramer dominates sequence dependency 

Analysis of non-covalent interactions in the NCP crystal 
structure (4) and mechanical unzipping of the nucleosome 
(33) indicated that DNA-histone interaction strength is 
greater over the H3-H4 tetramer compared to the H2A- 
H2B dimers. This suggests that the tetramer may impose 
stronger constraints on DNA conformation, which in turn 
could disproportionately affect stability and positioning. 
When we analysed variations in phosphate position 
within the binding platform between the NCP structures 
with the most diverse DNA sequences and DNA 
stretching configurations, we found that there is in 
general substantially greater conformational freedom 
over the H2A-H2B versus the H3-H4 sites (Table 2). 
This indicates that the DNA sequence content associated 
with the H3-H4 binding surface would have a dominant 
influence on positioning and stability. Correspondingly, 
H3-H4 association could impose discrimination between 
the two distinct forms of minor groove bending. However, 
examples of isolated kinks versus smooth bending 
accompanied by alternating shift, and indeed mixtures of 
the two, can be seen to substitute for one another at indi- 
vidual H3-H4 and H2A-H2B sites depending on the 
DNA sequence (Figures 3 and 6). Thus, the double helix 
seems free to adopt one of a subset of favourable 
conformational modes within the confines of the binding 
platforms. At the same time, however, there appear to be 
histone-specific distinctions that predispose a particular 
bending mode. This is most evident at the SHL ±2.5 
H3-H4 site that consistently displays a striking alternating 
shift pattern, wherein defined histone interactions could 
select for the distinct BII phosphate backbone configuration, 
which is notably enriched by this conformational 
mode (Figures 3 and 4) (5). 

Figure 3. Dinucleotide step parameters. (A) Illustration of the six degrees of freedom for DNA structure at the base pair step level. (B) Dinucleotide 
step values for NCP-601L (blue) and NCP147 (green) averaged over one particle half and for the two particle halves of NCP146b (red and yellow; 
NCP146b displays a distinct DNA-histone register in each half). Dinucleotide steps in major groove-inward sections in addition to the flanking 
major-to-minor groove-inward interface steps have a grey shaded background. The four dinucleotide steps in each minor groove-inward section have 
a white background with a gold shading indicating the step located at the pressure point. 

Nucleic Acids Research, 2012, Vol. 40, No. 13 

Minor groove-inward regions dominate 
sequence-dependent attributes 

As suggested previously (10) and consistent with recent 
computational studies (30,31), our data here illustrate 
that the sequence content of minor groove-inward 
sections dominates positioning and stability. Major 
groove-inward sections display greater conformational 
freedom in the crystal structures and their sequence 
content is more variable (Figures 1B, 3 and 5). Although 
systematically G|C-rich, these regions are less conserved in 
the highest affinity sequences compared to the minor 
groove-inward sections (see Widom consensus; 
Figure 1B). Moreover, the major groove-inward sections 
of the α-satellite sequences are in fact A|T-rich, although 
highly flexible, centrally located CA=TG and TA 
dinucleotides are likely a stabilizing factor. Nonetheless, 
the optimal major groove-inward sequences consist 
mostly or purely of G|C nucleotides and the ideal motifs 
apparently correspond to alternating dinucleotide types, 
wherein the flexibility inherent in CG and GC steps is 
seemingly stabilizing. 

In contrast to the nominal conformational character of 
major groove bending in the nucleosome core, the minor 
groove-inward sections display specialized modes of 
distortion (5,10). Although the TA dinucleotides in the 
NCP-601L structure coincide with the points of maximal 
bending into the minor groove, the bending mode over the 
different sites is mostly smooth with low curvature, and 
even in the case of the kinking observed at SHL ±3.5, the 
magnitude (of roll) is much reduced relative to that at 
CA=TG dinucleotides in the α-satellite constructs 
(Figure 3). On closer inspection, however, the TA 
elements display unusual distortions not obvious from 
the dinucleotide step parameters alone (Figure 4). This is 
consistent with permanganate footprinting analysis of 
naked and nucleosomal DNA in solution, which reveals 
sites of double helix distortion by virtue of elevated 
reactivity of unstacked thymine (primary intrinsic reactivity) 
and cytosine (secondary intrinsic reactivity) bases towards 
permanganate ion (Figure 7 and Supplementary Figure 
S3) (9,34). The NCP DNA displays pronounced reactivity 
hotspots relative to the naked state, and almost all of these 
coincide with minor groove-inward sections (Figures 1B, 
4 and 8). This emphasizes the primary importance of the 
sequence content over these sites, since the DNA is distorted 
greatest from its naked state conformation 
(Table 3). Furthermore, the most prominent hotspots 
coincide with H3-H4 tetramer binding sites, in particular 
SHL ±1.5 and SHL ±2.5, which are indeed the most 
conserved regions in the Widom consensus sequence. 

Figure 4. DNA binding and structure over TA elements in NCP-601L. (A-D) TA dinucleotides (bases, space filling dots) are situated at the pressure 
points of the centrally located histone binding sites, where they display a diversity in conformational distortions. Rotational dislocation of bases 
within the TA step (curved arrows, only front strand shown for clarity) permits unabated compression of the minor groove for fitting of the binding 
platform (phosphorus atoms, spheres) to the histone surface. DNA-histone hydrogen bonds appear as black dashed lines. Permanganate reactivity 
hotspots are designated with asterisks, whereby that at SHL ±2.5 is by far the most prominent as a consequence of extreme base unstacking 
promoted by base pair displacement into the minor and major grooves via shift (C, arrows). Note that shift is also the primary DNA structural 
parameter influencing platinum drug reaction, since it dictates solvent access to the major groove edge (14). 

The origin of the special structure and function of the 
TA dinucleotides becomes clear by considering association 
of the binding platform. The conspicuous localization of 
TA at defined points of the minor groove-inward sections 
in the 601 and consensus sequences coincides with the 
positions at which the minor groove is most directly facing 
inward and, consequently, under the greatest pressure to 
contribute to superhelix formation through distortions 
into the minor groove (Figures 1 and 4). These 'pressure 
points' are seemingly invariant with respect to the DNA 
sequence since they also coincide with the most inward 
facing dinucleotide step positions of NCP147, which 
notably lacks TA at any of the minor groove-inward 
facing sections (Supplementary Figure S4). In contrast 
to the more conservative behaviour of other dinucleotide 
types at the pressure points (DNA stretching 
notwithstanding), individual base pairs within the TA steps are 
distorted by extreme propeller twisting and other 
deformations that cause unstacking and diminish Watson-Crick 
bonding (Figure 4 and Table 4). This kind of behaviour 
for A·T base pairs was described in a recent analysis (28) 
of the NCP-601 structures (12,27). For the eight TA steps 
situated at the pressure points in the NCP-601L model, we 
find that, with the exception of propeller twist values that 
can reach −48° and average −25 ± 14°, the five other base 
pair parameters are on average close to ideal (near zero 
values; Table 4). This contrasts with the behaviour at the 
corresponding pressure point steps in NCP147 and 
NCP146b, which are non-TA and have a significantly 
greater tendency for a particular directionality of base 
pair distortion. What is remarkable for the TA step base 
pairs in NCP-601L is that they display a degree of 
distortional variation that is well above that of the 
corresponding base pairs of NCP147 or NCP146b. 

The pronounced dinucleotide step and base pair distortions 
associated with the pressure point TA steps result in 
a nucleotide from one or both DNA strands being to some 
extent dislocated from the double helical stack, which 
serves to collapse the minor groove for fitting of the 
binding platform to the histone surface. The type of 
distortion required depends critically on the intrinsic 
conformational preference of the sequence element, whereby 
the intrinsically narrow TTTAA motifs at SHL ±1.5 
display only minimal bending into the minor groove 
(Figures 1A, 3 and 4). In fact, when NCP145 is modified 
to have these motifs (yielding NCP-TA2; Figure 1B), such 
that the TA step is placed where the extreme 
stretch-induced kink was present at the SHL ±1.5 
GG=CC in NCP145, the resulting kinking is much 
reduced because the binding platform would otherwise 
be overly narrow (Figure 9). In this regard, the most 
severe kinks at minor groove-inward regions occur at 
sequence elements having G|C character, and therefore 
arise from a combination of DNA positioning preferences 
and local groove narrowing requirements that together 
determine the incidence of double helix stretching 
(Figure 1B). 

Figure 5. Constraints of histone binding on double helix structure and site-dependent variation between different DNA sequences. (A-D) The 
structures of NCP147 (magenta), NCP146b (yellow) and NCP-601L (cyan) were superimposed via the histone-fold regions of the octamer (DNA 
binding motifs: L, loop; A, α-helix). The phosphorus atoms of the binding platforms appear as large spheres, and water molecules mediating a 
conserved DNA-histone hydrogen bond bridge are shown with small spheres (A and C; CWB). (A and C) H3-H4 tetramer binding sites. (B and D) 
H2A-H2B dimer binding sites. 

DISCUSSION 

DNA sequence-dependent nucleosome positioning and 
stability are governed by the energetic cost of deforming 
the double helix to fit the histone binding surface. In this 
regard, the histone octamer provides a roughly rigid 
scaffold to which the DNA must adapt for association 
(Figure 5). The structural constraints on the double helix 
are greatest over the H3-H4 tetramer binding sites, and 
the conformational challenges towards deformation are 
maximal for minor groove-inward facing regions, which 
make the DNA sequence at these locations dominate 
histone octamer affinity. Considering our data here and 
previous findings on histone site-dependent contributions 
to nucleosome positioning and stability, as well as the 
variation in degree of nucleotide conservation in the 
Widom consensus sequence (9,10,17) (Figure 1B), allows 
us to infer a tentative ranking for the relative importance 
of the different minor groove-inward sections towards 
histone octamer affinity: SHL ±1.5 > SHL ±2.5 > 
SHL ±0.5 > SHL ±3.5 > SHL ±4.5 > SHL ±5.5. 

The interplay between DNA sequence, deformation and 
affinity for the histone octamer can be illustrated by 
analysis of the constructs with highest and lowest stability 
in our study, NCP-601L and NCP146b, which display a 
nearly 2-fold difference in salt stability (Figure 2). For 
NCP-601L, permanganate reactivity hotspots are limited 
to the four different TA dinucleotide sites surrounding the 
nucleosome centre (SHL ±0.5, ±1.5, ±2.5, ±3.5) in 
addition to the location of stretch-induced extreme 
kinking at the GA = TC step of SHL ±4.5 (Figures 1B
and 7). Positioning of TA steps at these dominant pressure
points allows for a minimum distortion energy, because
TA is the loosest stacking dinucleotide step in conjunction
with the weaker base pairing interaction for A·T versus
Figure 6. DNA single-strand histone attachment points act as hinges to allow double helix conformational freedom. (A and B) DNA stretching
around SHL ±1 to SHL ±2 in NCP145 (cyan) gives rise to substantial structural differences compared to NCP147 (magenta), which displays no
stretching but is composed of a DNA with nearly identical sequence. Histone association with the binding platforms (phosphorus atoms, spheres) is
very similar between the two constructs, whereby the extreme stretch-associated kinking in NCP145 (bases, space-filling dots) at either SHL -1 (A;
CA = TG, roll = 38°, rise = 5.0 Å) or SHL 1.5 (B; GG = CC, roll = -52°, rise = 5.7 Å) is accommodated largely by swivel-like repositioning of the
double helix (curved arrows) about the hinges (brackets). This results in a distinct distribution of helix axis (tubes) curvature between the major and
minor groove-inward regions (values shown).



G·C (29). In fact, the next loosest stacking step,
CA = TG, has nearly twice the stacking energy of TA
and moreover contains a G·C pair. In this regard, as
gleaned from the NCP structures and permanganate re- 
activity, the TA step is free to distort in a great variety of 
ways, at both the step and base pair levels, for low energy 
fitting to the octamer. In fact, the overall degree of DNA 
distortion in the nucleosomal state relative to the naked 



state, estimated by the differential permanganate reactiv- 
ity, is larger for NCP-601L compared to NCP146b 
(Table 3). Therefore, the deformation of the 146b DNA 
from fitting on the octamer is actually less than that of the 
601L DNA, while the resulting energetic cost of distortion
must be substantially greater because of the suboptimal 
146b sequence. Moreover, in contrast to NCP-601L and 
the other constructs, NCP146b displays reactivity 



Nucleic Acids Research, 2012, Vol. 40, No. 13 6349 



[Figure 7 gel image. Panel A: NCP147, NCP145, NCP-TA2. Panel B: NCP-601L, NCP-601R, NCP146b. Lanes for each construct: m, D, N.]




Figure 7. KMnO4 footprinting reveals DNA context-dependent distortions. (A and B) DNA samples, comprising six different constructs, correspond
to purine sequencing standard (m) or naked DNA (D) and NCP (N) that were subjected to permanganate reactivity analysis. Minor
groove-inward-facing nucleotides (highlighted in orange), regions of DNA stretching (magenta arrows; dashed for mixed stretched and non-stretched
configurations) and the central nucleotide (blue dot) are based on the crystal structure assignments (Figure 1B).



hotspots also at SHL ±2 and ±5, which correspond to
pronounced deformations relative to the naked state at
major groove-inward regions (Table 3). In this sense,
there would be an additional energetic expenditure to
deform A|T elements of 146b for major groove
bending, a type of conformation that is easily accessible
to most G|C sequences.



Although TA dinucleotides are ideal for the mechanics 
required at pressure points, the optimal sequence context 
can vary over the different histone binding sites. This is 
perhaps most evident at the most stringent motifs, SHL
±1.5 and ±2.5, for which the sequence elements are identical
between NCP-601L and the Widom consensus
sequence and apparently optimal. The permanganate 







Figure 8. G|C-rich elements can undergo energetically favourable distortions at minor groove-inward positions by adopting specialized
conformations. Stereo view of the AGGGA (= TCCCT) motif at the SHL -1.5 location in NCP146b, which displays smooth bending with pronounced
alternating displacement of base pairs into the minor groove (downward-pointing arrow and filled circle indicating displacement away from the
viewer) and major groove (upward-pointing arrows) via fluctuating shift. DNA-histone hydrogen bonds appear as black dashed lines. Permanganate
reactivity hotspots are designated with asterisks.



Table 4. Base pair parameters of dinucleotides at pressure points

Parameter       NCP-601L          NCP146b           NCP147

Shear (Å)       -0.08 ± 0.48      0.08 ± 0.42       -0.22 ± 0.23
                -1.12, 0.76       -0.51, 1.01       -0.60, 0.28

Stretch (Å)     0.03 ± 0.20       -0.15 ± 0.20      -0.12 ± 0.13
                -0.31, 0.39       -0.51, 0.23       -0.38, 0.26

Stagger (Å)     -0.16 ± 0.35      0.41 ± 0.35       0.30 ± 0.29
                -0.78, 0.41       -0.01, 1.28       -0.48, 0.75

Buckle (°)      0.8 ± 9.3         4.1 ± 7.8         2.7 ± 6.0
                -10.2, 23.4       -11.0, 13.6       -10.4, 18.2

Opening (°)     1.0 ± 4.5         -1.7 ± 4.0        -1.1 ± 2.8
                -6.0, 11.0        -8.6, 5.0         -7.3, 2.9

Propeller (°)   -24.8 ± 13.8      -19.0 ± 10.9      -17.1 ± 10.5
                -48.2, -0.8       -35.7, -1.9       -37.2, 2.9

Values are averages (upper entry) and minima/maxima (lower entry)
associated with the eight dinucleotide steps at the pressure points of
SHL ±0.5, ±1.5, ±2.5 and ±3.5. Shear and buckle values correspond
to all purine·pyrimidine base pairs placed in the same orientational
frame.



reactivity of TA in the NCP-601L SHL ±2.5 CTAGA 
element is by far the greatest (Table 3), consistent with 
the tremendous base unstacking seen in the structure 
(Figure 4C). The TA reactivity of the SHL ±1.5 TTTAA 
element is the next strongest, but notably this site in the 
naked state has low reactivity, which denotes good 
stacking and absence of significant distortion, whereas 
the naked state CTAGA element displays substantial re- 
activity, indicating an intrinsically distorted structure. In 
this way, the TTTAA element is tailored to the extreme 
groove-narrowing requirements at SHL ±1.5 (10), while
CTAGA appears optimal for the extreme alternating shift 
pattern favoured at SHL ±2.5. The energetic cost of 
unstacking a dinucleotide from the double helix is 
governed by the identity of the flanking nucleotides, 



which for TA is minimal when N1TAN4 equals CTAG
(29). In this regard, flanking AG = CT steps are not
only ideal from the standpoint of the loosest stacking
environment for TA (29), but they also provide the
most shift-flexible context for an N1TAN4
tetranucleotide (35).
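
As a rough illustration of the flanking-context argument, the following Python sketch lists every TA step in a sequence together with its N1TAN4 tetranucleotide, so that CTAG contexts can be picked out. The function name `ta_contexts` and the example sequence are hypothetical and are not taken from this study; the sketch only locates sequence contexts and makes no attempt to estimate stacking or shift energies.

```python
def ta_contexts(seq):
    """Return (index, tetranucleotide) for each TA step that has both flanks.

    index is the 0-based position of the T of the TA step; the tetranucleotide
    is the N1-T-A-N4 window read on the given strand.
    """
    seq = seq.upper()
    hits = []
    # Start at 1 and stop 2 before the end so that N1 and N4 always exist.
    for i in range(1, len(seq) - 2):
        if seq[i:i + 2] == "TA":
            hits.append((i, seq[i - 1:i + 3]))
    return hits


# Hypothetical example sequence (an assumption for illustration only).
example = "GCTAGATTTAAC"
contexts = ta_contexts(example)
print(contexts)  # -> [(2, 'CTAG'), (8, 'TTAA')]

# Flag the TA steps sitting in the CTAG context discussed in the text.
ctag_sites = [i for i, tetra in contexts if tetra == "CTAG"]
print(ctag_sites)  # -> [2]
```

Because only the given strand is scanned, a context such as CTAG, which is its own reverse complement, is reported once per site.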

With the exception of the requirement for soft spots at 
every turn of the double helix best fulfilled by TA 
elements, the highest affinity sequences are decisively 
G|C-rich, with the 601L and Widom consensus
composed of only 43.4 and 45.2% A|T nucleotides, re- 
spectively (Figure 2). This could help explain the correl- 
ation between G|C% and nucleosome occupancy in vivo, 
but maximizing histone octamer affinity by selection 
in vitro (17,26) represents the greater good of sequence 
space, and such sequences have not been identified in 
genomic analyses. In contrast, the contribution of DNA 
sequence towards nucleosome positioning in vivo appears 
to be largely governed by the lesser of available evils, 
which notably includes octamer exclusion from poly-A:T 
elements (1,2,36). Indeed, the fact that both minor and 
major groove-inward sections rich in either A|T or G|C 
nucleotides can yield well-positioning sequences 
(Figure IB) would weaken generalized correlations, 
contributing to the lack of agreement between some pos- 
itioning studies. 
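
The A|T percentages quoted above are simple base counts. A minimal Python sketch of that calculation is given here, assuming a single-strand sequence string as input; the function name `at_percent` is hypothetical, and the short motifs used as examples are the five-base-pair elements named in the text rather than full constructs.

```python
def at_percent(seq):
    """Percentage of A and T among the A/C/G/T characters of seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T characters")
    at_count = sum(1 for b in bases if b in "AT")
    return 100.0 * at_count / len(bases)


# Five-base-pair elements mentioned in the text.
for motif in ("TTTAA", "CTAGA", "AGGGG"):
    print(motif, at_percent(motif))
# -> TTTAA 100.0
# -> CTAGA 60.0
# -> AGGGG 20.0
```

The G|C percentage is the complement, 100 minus the returned value, since only unambiguous bases are counted.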

A key consideration behind the stabilizing effect of G|C 
content is the contribution of specific sequence elements 
that are of sub-optimal favourability with respect to minor 
groove bending. For the most important minor 
groove-inward sections, the 'second best' motifs evident 
here are composed mostly of G|C nucleotides, the 
SHL ±2.5 AGGGG, SHL ±0.5 GGGGA and 
SHL ±3.5 AGGGA elements common to the second 
and third most stable constructs in our study, NCP-601 







Figure 9. The intrinsic conformation of the DNA sequence dictates the structural changes required for histone association. NCP145 (magenta) and
NCP-TA2 (cyan) both undergo DNA stretching around SHL 1 to SHL 2, but differ in DNA sequence over the SHL ±1.5 region. The extreme
kinking at GG = CC (bases, space-filling dots; roll = -52°, rise = 5.7 Å) in NCP145 serves to massively elongate (stretch, double-headed arrows) the
DNA and simultaneously narrow the minor groove of the G|C-rich element, GCCTT. In contrast, the TTAAA element of NCP-TA2 has
an intrinsically narrow minor groove (compare SHL 1.5 phosphodiester backbone regions facing viewer) and therefore only exhibits a modest
stretch-associated kink at TA (bases, space-filling dots; roll = -17°, rise = 4.2 Å) as the minor groove would otherwise be overly narrow for histone
association of the binding platform (phosphorus atoms, spheres). The remainder of the DNA stretch for NCP-TA2 is distributed over several
dinucleotides towards SHL 2.



and NCP-601R (Figure 1B). Moreover, the H3-H4
tetramer binding sequences of the non-601 (α-satellite)
constructs are also largely G|C, composed predominantly 
of the same step types, GG = CC, AG = CT and 
GA = TC, in addition to GC. Without consideration of 
specific sequence context, these dinucleotide types appear 
less flexible than pyrimidine-purine steps with respect to 
minor groove bending (30,31). However, together with 
AG = CT, the G|C steps are the most flexible towards 
shift (in order of decreasing flexibility: CG, GG = CC, 
AG = CT, GC) (35). Although GA = TC is ranked as 
the least shift-flexible step type, it displays an average 
shift value (-0.28 ± 0.46 Å for GA) of greater magnitude
than any of the other step types (35). In this regard, it is 
important to note that the phases of the alternating shift 
patterns in the minor groove-inward regions appear to be 
conserved (Figure 3B) and thus constrained by histone 
binding, meaning that a shift-rigid step like GA = TC 
can also contribute favourably when appropriately pos- 
itioned. Indeed, the GA steps in NCP-601L, NCP147 
and NCP146b tend to adopt only negative shift values 
that are on average -0.47 ± 0.58 Å (the corresponding
value for GG steps is by comparison 0.07 ± 0.70 Å), and
in this way, they can support minor groove bending especially
at the periphery of minor groove-inward regions,
such as that at the NCP-601L SHL ±2.5 CTAGA or
NCP146b SHL ±1.5 AGGGA motif. In this manner,
shift-flexible as well as shift-biased dinucleotide steps in 
G|C-rich motifs can work cooperatively to allow 



favourable transition into a specialized form of minor 
groove bending (Figure 8). 
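
The step notation used above (GA = TC, AG = CT, GG = CC, CA = TG) pairs each dinucleotide with its reverse complement, which describes the same step read from the opposite strand. The Python sketch below shows one way to collapse the 16 dinucleotides into the 10 unique step types and to tally them for a motif. The function names and the alphabetical ordering of each pair label (for example "CC=GG" rather than "GG = CC") are assumptions made for illustration, not conventions of this study.

```python
from collections import Counter

# Translation table for Watson-Crick complements.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string containing only A, C, G, T."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def step_type(dinuc):
    """Strand-independent label for a dinucleotide step.

    Self-complementary steps (AT, TA, CG, GC) keep their own name; all others
    are labelled with both readings, sorted alphabetically (arbitrary choice).
    """
    d = dinuc.upper()
    rc = reverse_complement(d)
    return d if d == rc else "=".join(sorted((d, rc)))


def step_counts(seq):
    """Count the step types of all overlapping dinucleotides in seq."""
    seq = seq.upper()
    return Counter(step_type(seq[i:i + 2]) for i in range(len(seq) - 1))


# The AGGGA motif discussed in the text.
print(dict(step_counts("AGGGA")))  # -> {'AG=CT': 1, 'CC=GG': 2, 'GA=TC': 1}
```

Counting a motif and its reverse complement gives the same tally, which is the reason for using a strand-independent label.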

The mechanical model and DNA sequence 
dependencies presented here provide a new scheme for 
understanding nucleosome structure, dynamics and stabil- 
ity. For instance, considering the specific constraints 
imposed by protein association together with conform- 
ational preferences of polynucleotide elements (e.g. five 
base pair sections) would yield improved methods for pre- 
dicting histone octamer-DNA affinity. In addition, the 
ability to design tailored high affinity histone octamer 
binding sequences from knowledge of the rules governing 
stability will assist the next generation of studies on nu- 
cleosomal assemblies. 